Weighted Sampling




Transfer Learning and Mixup for Fine-Grained Few-Shot Fungi Classification

Tam, Jason Kahei, Gustineli, Murilo, Miyaguchi, Anthony

arXiv.org Artificial Intelligence

Accurate identification of fungi species presents a unique challenge in computer vision due to fine-grained inter-species variation and high intra-species variation. This paper presents our approach for the FungiCLEF 2025 competition, which focuses on few-shot fine-grained visual categorization (FGVC) using the FungiTastic Few-Shot dataset. Our team (DS@GT) experimented with multiple vision transformer models, data augmentation, weighted sampling, and incorporating textual information. We also explored generative AI models for zero-shot classification using structured prompting but found them to significantly underperform relative to vision-based models. Our final model outperformed both competition baselines and highlighted the effectiveness of domain-specific pretraining and balanced sampling strategies. Our approach ranked 35/74 on the private test set in post-competition evaluation, suggesting that further work is warranted on metadata selection and domain-adapted multi-modal learning. Our code is available at https://github.com/dsgt-arc/fungiclef-2025.
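
For a concrete picture of the balanced-sampling component, here is a minimal sketch using PyTorch's WeightedRandomSampler to oversample rare classes; the toy dataset and the inverse-class-frequency weights are illustrative assumptions, not the team's exact pipeline.

```python
# Sketch of class-balanced ("weighted") sampling with PyTorch.
# The dataset, labels, and weighting scheme are illustrative assumptions.
from collections import Counter

import torch
from torch.utils.data import DataLoader, TensorDataset, WeightedRandomSampler

labels = torch.tensor([0, 0, 0, 0, 1, 1, 2])   # imbalanced toy labels
features = torch.randn(len(labels), 8)          # stand-in image features
dataset = TensorDataset(features, labels)

# Weight each example by the inverse frequency of its class, so rare
# classes are drawn roughly as often as common ones.
counts = Counter(labels.tolist())
weights = torch.tensor([1.0 / counts[int(y)] for y in labels])

sampler = WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)
loader = DataLoader(dataset, batch_size=4, sampler=sampler)

for x, y in loader:
    pass  # rare classes now appear in batches at a balanced rate
```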



Weighted Sampling for Masked Language Modeling

Zhang, Linhan, Chen, Qian, Wang, Wen, Deng, Chong, Cao, Xin, Hao, Kongzhang, Jiang, Yuxin, Wang, Wei

arXiv.org Artificial Intelligence

Masked Language Modeling (MLM) is widely used to pretrain language models. The standard random masking strategy in MLM biases pre-trained language models (PLMs) toward high-frequency tokens: representation learning of rare tokens is poor, which limits PLM performance on downstream tasks. To alleviate this frequency bias, we propose two simple and effective Weighted Sampling strategies for masking tokens based on token frequency and training loss. We apply these two strategies to BERT and obtain Weighted-Sampled BERT (WSBERT). Experiments on the Semantic Textual Similarity (STS) benchmark show that WSBERT significantly improves sentence embeddings over BERT. Combining WSBERT with calibration methods and prompt learning further improves sentence embeddings. We also investigate fine-tuning WSBERT on the GLUE benchmark and show that Weighted Sampling also improves the transfer learning capability of the backbone PLM. We further analyze and provide insights into how WSBERT improves token embeddings.
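
The core idea of frequency-weighted masking can be sketched in a few lines. The exact weighting functions in WSBERT are not reproduced here; the inverse-frequency weights, toy corpus counts, and 15% masking budget below are illustrative assumptions.

```python
# Sketch of frequency-weighted token masking for MLM: rare tokens get a
# proportionally higher chance of being masked. Token IDs, corpus counts,
# and the weighting rule are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

token_ids = np.array([101, 2023, 2003, 1037, 4895, 19204, 102])  # toy sequence
token_freq = {101: 1e6, 2023: 5e5, 2003: 8e5, 1037: 9e5,
              4895: 2e3, 19204: 1e3, 102: 1e6}                   # toy corpus counts

# Inverse-frequency weights, normalized into a sampling distribution.
weights = np.array([1.0 / token_freq[t] for t in token_ids])
probs = weights / weights.sum()

num_to_mask = max(1, int(0.15 * len(token_ids)))                 # 15% masking budget
masked_positions = rng.choice(len(token_ids), size=num_to_mask,
                              replace=False, p=probs)

MASK_ID = 103
masked = token_ids.copy()
masked[masked_positions] = MASK_ID  # rare tokens are masked most often
```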


Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm

Needell, Deanna, Ward, Rachel, Srebro, Nati

Neural Information Processing Systems

We improve a recent guarantee of Bach and Moulines on the linear convergence of SGD for smooth and strongly convex objectives, reducing a quadratic dependence on the strong convexity to a linear dependence. Furthermore, we show how reweighting the sampling distribution (i.e., importance sampling) can further improve this dependence. Our results are based on a connection we make between SGD and the randomized Kaczmarz algorithm, which allows us to transfer ideas between the separate bodies of literature studying each of the two methods.
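
To make the Kaczmarz connection concrete, here is a minimal sketch of the randomized Kaczmarz method with its standard weighted row-sampling rule (row i drawn with probability proportional to ||a_i||^2), which the abstract frames as an instance of importance-sampled SGD. The problem sizes, the consistent linear system, and the iteration count are illustrative assumptions.

```python
# Randomized Kaczmarz with weighted row sampling on a consistent system Ax = b.
# Row i is sampled with probability ||a_i||^2 / ||A||_F^2, then the iterate is
# projected onto the hyperplane <a_i, x> = b_i.
import numpy as np

rng = np.random.default_rng(0)

m, n = 200, 50
A = rng.standard_normal((m, n))
x_true = rng.standard_normal(n)
b = A @ x_true                                   # consistent system

row_norms_sq = np.sum(A**2, axis=1)
probs = row_norms_sq / row_norms_sq.sum()        # weighted sampling distribution

x = np.zeros(n)
for _ in range(5000):
    i = rng.choice(m, p=probs)                   # importance-sampled row
    a_i = A[i]
    x += (b[i] - a_i @ x) / row_norms_sq[i] * a_i

print(np.linalg.norm(x - x_true))                # error should be near zero
```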


Weighted Sampling for Combined Model Selection and Hyperparameter Tuning

Sarigiannis, Dimitrios, Parnell, Thomas, Pozidis, Haris

arXiv.org Machine Learning

The combined algorithm selection and hyperparameter tuning (CASH) problem is characterized by large hierarchical hyperparameter spaces. Model-free hyperparameter tuning methods can explore such large spaces efficiently since they are highly parallelizable across multiple machines. When no prior knowledge or meta-data exists to boost their performance, these methods commonly sample random configurations following a uniform distribution. In this work, we propose a novel sampling distribution as an alternative to uniform sampling and prove theoretically that it has a better chance of finding the best configuration in a worst-case setting. In order to compare competing methods rigorously in an experimental setting, one must perform statistical hypothesis testing. We show that there is little-to-no agreement in the automated machine learning literature regarding which methods should be used. We contrast this disparity with the methods recommended by the broader statistics literature, and identify the most suitable approach. We then select three popular model-free solutions to CASH and evaluate their performance, with uniform sampling as well as the proposed sampling scheme, across 67 datasets from the OpenML platform. We investigate the trade-off between exploration and exploitation across the three algorithms, and verify empirically that the proposed sampling distribution improves performance in all cases.
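
A minimal sketch of the idea, non-uniform random search over a hierarchical CASH space, is given below: a model family is drawn from a weighted distribution, then its hyperparameters are sampled within that family. The search space and the family weights are illustrative assumptions, not the distribution derived in the paper.

```python
# Sketch of weighted sampling over a hierarchical CASH space: draw a model
# family non-uniformly, then sample its hyperparameters. Uniform sampling
# would use equal family weights; the weights here are hypothetical.
import random

random.seed(0)

search_space = {
    "random_forest": {"n_estimators": (50, 500), "max_depth": (2, 16)},
    "svm":           {"C": (1e-3, 1e3), "gamma": (1e-4, 1e1)},
    "knn":           {"n_neighbors": (1, 50)},
}

# Non-uniform weights over model families (hypothetical values).
family_weights = {"random_forest": 0.5, "svm": 0.3, "knn": 0.2}

def sample_configuration():
    family = random.choices(list(family_weights),
                            weights=list(family_weights.values()))[0]
    # Continuous ranges are used for all hyperparameters for simplicity.
    params = {name: random.uniform(lo, hi)
              for name, (lo, hi) in search_space[family].items()}
    return family, params

for _ in range(3):
    print(sample_configuration())
```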